Abstract:
The present invention provides a method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing. The method of the present invention comprises the following steps: constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence; subjecting the high-throughput sequencing library to high-throughput sequencing, and analyzing the nucleotide sequence components according to the sequencing results; the sequence of the extension primer used in the construction of the high-throughput sequencing library consisting of the DNA molecule set forth in positions 1-22 of SEQ ID NO: 2 and N bases (A, T, C or G) in sequence; and N being an integer greater than or equal to 6. It is proved by experiments that the method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing of the present invention can quickly, accurately, and comprehensively analyze the purity and content of each component in the oligonucleotide sequence.
Publication number: EP3680346A1
Application number: EP18857038.6
Filing date: 2018-08-23
Publication date: 2020-07-15
Inventors: Liangrang GUO; Peizhuo ZHANG; Lei YU
Applicants: Suzhou Genepharma Co Ltd; Suzhou Genesci Co Ltd
Main IPC class: C12N15-00
Description:
[0001] The present invention belongs to the field of biotechnology, and particularly relates to a method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing and to applications of the method.
BACKGROUND ART
[0002] In recent years, more and more pharmaceutical companies at home and abroad have been involved in gene medicines, investing heavily in the development of new gene medicines to combat various diseases. Antisense oligodeoxynucleotide technology is a treatment method that uses artificially synthesized or biosynthesized DNA or RNA that is complementary to RNA to block and inhibit the expression of genes related to disease occurrence, which can be used to treat tumors or genetic diseases caused by gene mutations.
[0003] Antisense technology is a new method of drug development. The drugs developed using this technology are called antisense drugs and involve antisense DNA, antisense RNA, and ribozyme. According to the principle of nucleic acid hybridization, antisense drugs can hybridize with specific genes and interfere with the production of pathogenic proteins at the gene level, i.e., interfere with the transmission of genetic information from nucleic acids to proteins. Traditional drugs mainly act directly on the pathogenic proteins themselves, while antisense drugs act on the genes that produce the proteins. Compared with traditional drugs, antisense drugs have higher selectivity and efficiency and can be widely used in the treatment of a variety of diseases, such as infectious diseases, inflammation, cardiovascular diseases and tumors.
[0004] Gene therapy refers to the introduction of foreign normal genes into target cells to correct or compensate for diseases caused by gene defects and abnormalities, so as to achieve therapeutic purposes. Broadly speaking, gene therapy can also include measures and new technologies to treat certain diseases at the DNA level.
[0005] Antisense DNA mainly refers to antisense oligodeoxynucleotide (AS-ODN). AS-ODN is a short single-stranded DNA fragment, which is artificially synthesized and complementary to one or more sites (complementary regions) of the target gene mRNA and can inhibit or reduce the expression of the target gene. AS-ODN can act on different targets: binding to double-stranded DNA to regulate transcription; binding to mRNA precursors or splice junctions to inhibit splicing of mRNA precursors and affect the transport of spliced mRNA from the nucleus to the cytoplasm; binding to mRNA in the cytoplasm to block translation; binding to specific proteins to regulate gene expression.
[0006] Because antisense DNA is synthesized artificially, base insertions and deletions occur during synthesis and leave a large amount of impurities in the synthesized DNA. The synthesized sequence must be purified for pharmaceutical use, so the components of the purified sequence need to be analyzed and confirmed. However, there is currently no good solution for the analysis of single-stranded oligonucleotide sequences.
SUMMARY OF THE INVENTION
[0007] The technical problem to be solved by the present invention is how to quickly, accurately and comprehensively analyze the composition and purity and/or content of each component sequence in an artificially synthesized oligonucleotide sequence.
[0008] In order to solve the above technical problem, the present invention first provides a method for constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence.
[0009] The method for constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence provided by the present invention comprises the following steps:
1) adding a poly tail to the 3' end of an oligonucleotide sequence to be detected to obtain a poly-tailed product;
2) performing reverse extension and amplification on the poly-tailed product to obtain a reverse extension and amplification product; the primers used for the reverse extension and amplification consisting of the oligonucleotide to be detected and an extension primer; the extension primer sequence consisting of the DNA molecule set forth in positions 1-22 of SEQ ID NO: 2 and N bases (A, T, C or G) in sequence; and N being an integer greater than or equal to 6;
3) precipitating the reverse extension and amplification product to obtain a precipitated product;
4) performing end repair, A-tailing reaction, adapter ligation and PCR amplification on the precipitated product sequentially to obtain a high-throughput sequencing library.
[0010] In the above method, the length of the oligonucleotide can be 8-120 bp. The oligonucleotide can be single-stranded DNA or double-stranded DNA. In a specific embodiment of the present invention, the oligonucleotide is single-stranded DNA, and its nucleotide sequence is SEQ ID NO: 5 with a size of 21 bp.
[0011] In the above method, in step 1), the poly tail can be a poly A tail, a poly G tail, a poly C tail, or a poly T tail; in a specific embodiment of the present invention, the poly tail is a poly A tail. The poly tail is added to the 3' end of the oligonucleotide sequence to be detected as follows: 1.5 µL of terminal transferase, 1 µL of the oligonucleotide to be detected, 0.5 µL of dATP (or dTTP, dCTP or dGTP) (25 µM), 4 µL of 5×TdT Buffer and nuclease-free water are mixed to obtain a reaction system with a total volume of 20 µL. The final concentration of the oligonucleotide to be detected in the reaction system is 5 µM.
[0012] In the above method, in step 2), N can be any integer greater than or equal to 6. In a specific embodiment of the present invention, N is specifically 20. When N is 20, the extension primer sequence is the DNA molecule set forth in SEQ ID NO: 1 or the DNA molecule set forth in SEQ ID NO: 2 or the DNA molecule set forth in SEQ ID NO: 3 or the DNA molecule set forth in SEQ ID NO: 4. In a specific embodiment of the present invention, the extension primer sequence is the DNA molecule set forth in SEQ ID NO: 1.
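The composition rule for the extension primer (a fixed 22-base segment followed by N identical bases, N ≥ 6) can be sketched in Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the 22-base segment is copied from the start of the extpT sequence given in Example 1 (SEQ ID NO: 1), and it is assumed that SEQ ID NO: 2 shares those first 22 positions, since its sequence is not reproduced in this text. The sketch also assumes the primer tail is the complement of the poly tail added in step 1), as in the example where a poly A tail is paired with the poly T primer.

```python
# Illustrative sketch only; names are hypothetical and not part of the patent.

# First 22 bases of extpT (SEQ ID NO: 1) as printed in Example 1.
# Assumption: positions 1-22 of SEQ ID NO: 2 are identical to these.
FIXED_PREFIX = "GAGACACGAATAGACGGCACGA"

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def build_extension_primer(tail_base: str, n: int = 20) -> str:
    """Return the fixed prefix followed by n bases complementary to the poly tail."""
    if tail_base not in COMPLEMENT:
        raise ValueError("tail_base must be one of A, T, C or G")
    if n < 6:
        raise ValueError("N must be an integer greater than or equal to 6")
    return FIXED_PREFIX + COMPLEMENT[tail_base] * n


# A poly A tail on the oligonucleotide pairs with a poly T primer tail (N = 20).
primer = build_extension_primer("A", 20)
print(len(primer), primer)
```

With `tail_base="A"` and `n=20` the result has the same layout as extpT: the 22-base segment followed by twenty T bases.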
[0013] The reverse extension and amplification reaction system (total volume 50 µL) consists of 25 µL of 2×Phanta Max Buffer, 2 µL of dNTPs (10 mM), 2 µL of the extension primer, 1 µL of the oligonucleotide to be detected, 1 µL of DNA polymerase I and nuclease-free water. The final concentration of the extension primer in the reverse extension and amplification system is 4 µM; the final concentration of the oligonucleotide to be detected in the reverse extension and amplification system is 2 µM.
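The final concentrations quoted for the tailing and reverse extension systems follow from simple dilution arithmetic (stock concentration × added volume / total volume). A minimal Python sketch, assuming the 100 µM stock concentrations listed in Tables 1 and 2 of the example; the function name is hypothetical:

```python
def final_concentration_um(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Final concentration (µM) after adding added_ul of a stock to a total_ul reaction."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_um * added_ul / total_ul


# Tailing reaction: 1 µL of 100 µM oligonucleotide in 20 µL.
oligo_tailing = final_concentration_um(100, 1, 20)
# Reverse extension: 2 µL of 100 µM extension primer in 50 µL.
primer_extension = final_concentration_um(100, 2, 50)
# Reverse extension: 1 µL of 100 µM oligonucleotide in 50 µL.
oligo_extension = final_concentration_um(100, 1, 50)

print(oligo_tailing, primer_extension, oligo_extension)
```

These values agree with the 5 µM, 4 µM and 2 µM stated in the description.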
[0014] In the above method, in step 3), the method used for the precipitation is sodium acetate precipitation; the method of subjecting the reverse extension and amplification product to sodium acetate precipitation is as follows: 3a) adding sodium acetate, absolute ethanol and glycogen to the reverse extension and amplification product; 3b) centrifuging, discarding the supernatant and collecting the precipitate; 3c) adding ethanol to the precipitate, centrifuging, discarding the supernatant, and collecting the precipitate.
[0015] In step 3a), 1/10 volume of sodium acetate, 2.5 volumes of absolute ethanol and 1 µL of glycogen are added to the reverse extension and amplification product; the pH of the sodium acetate is 5.2; the concentration of the glycogen is 20 mg/mL; in step 3b), the centrifugation conditions are 12000 rpm for 30 minutes at 4 °C; in step 3c), the centrifugation conditions are 12000 rpm for 5 minutes at 4 °C, and the ethanol is an 80% aqueous ethanol solution by volume; a step of placing at -80 °C for 30 minutes is further included between steps 3a) and 3b); step 3c) is repeated once.
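The reagent volumes for step 3a) scale with the volume of the reverse extension and amplification product. The following Python sketch shows the arithmetic; it assumes that both the 1/10 volume and the 2.5 volumes are taken relative to the product volume alone (the text does not say whether the ethanol volume should also account for the added sodium acetate), and the function name is hypothetical.

```python
def precipitation_additions(product_ul: float, glycogen_ul: float = 1.0) -> dict:
    """Volumes (µL) to add for sodium acetate precipitation of a product of product_ul."""
    if product_ul <= 0:
        raise ValueError("product volume must be positive")
    return {
        "sodium_acetate_ul": product_ul / 10,     # 1/10 volume, pH 5.2
        "absolute_ethanol_ul": 2.5 * product_ul,  # 2.5 volumes (assumed relative to product)
        "glycogen_ul": glycogen_ul,               # 1 µL of 20 mg/mL glycogen
    }


# For the 50 µL reverse extension and amplification reaction:
additions = precipitation_additions(50)
print(additions)
```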
[0016] In the above method, in step 4), the method of performing end repair, A-tailing reaction, adapter ligation and PCR amplification on the precipitated product sequentially is as follows: 4a) performing end repair and A-tailing reaction on the precipitated product to obtain a repaired product; 4b) ligating an adapter to the repaired product to obtain an adapter-ligated product; 4c) performing PCR amplification on the adapter-ligated product to obtain an amplification product, i.e., a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence.
[0017] In step 4a), the method of performing end repair and A-tailing reaction on the precipitated product is as follows: 15 µL of the precipitated product, 3 µL of 10×End Repair Buffer, 2 µL of T4 DNA polymerase, 2 µL of T4 Polynucleotide Kinase (T4PNK), 0.5 µL of Klenow DNA polymerase I, 0.5 µL of Bst DNA Pol I large Fragment and 7 µL of nuclease-free water are mixed and reacted.
[0018] In step 4b), the method of ligating an adapter to the repaired product is as follows: 30 µL of the repaired product, 15 µL of 10×T4 DNA Ligase Buffer, 2 µL of T4 DNA Ligase, 2 µL of Y-shape adapter and 1 µL of nuclease-free water are mixed and reacted.
[0019] In step 4c), the method of performing PCR amplification on the adapter-ligated product is as follows: 5 µL of the adapter-ligated product, 2 µL of each primer, 1 µL of dNTP mix (10 mM), 25 µL of 2×Phanta Max Buffer, 1 µL of Phanta Max Super Fide DNA polymerase and 14 µL of nuclease-free water are mixed and reacted. The final concentration of each primer in the reaction system is 1 µM.
[0020] Further included between steps 4b) and 4c) is a purification step, and the purification can be performed by using magnetic beads.
[0021] Step 4c) is followed by a purification step, and the purification can be performed by using a silica gel column.
[0022] In order to solve the above technical problem, the present invention further provides a product.
[0023] The product of the present invention is any one of the following a1) -a3): a1) the extension primer; a2) a PCR reagent containing the extension primer of a1); a3) a kit containing the extension primer of a1) or the PCR reagent of a2).
[0024] In the above product, the final concentration of the extension primer in the PCR reagent is 0.1 to 100 µM. In a specific embodiment of the present invention, the final concentration of the extension primer in the PCR reagent is 100 µM.
[0025] Use of the above product for constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence also belongs to the protection scope of the present invention.
[0026] Use of the above product for analyzing impurities of an oligonucleotide sequence also belongs to the protection scope of the present invention.
[0027] In order to solve the above technical problem, the present invention further provides a method for analyzing impurities of an oligonucleotide sequence.
[0028] The method for analyzing impurities of an oligonucleotide sequence provided by the present invention comprises the following steps: (1) constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence according to the above method; (2) subjecting the high-throughput sequencing library to high-throughput sequencing, and analyzing the nucleotide sequence components according to the sequencing results.
[0029] In order to solve the above technical problem, the present invention finally provides a new use of the above method.
[0030] The present invention provides use of the above method for analyzing components of an artificially synthesized antisense oligonucleotide sequence for gene therapy.
[0031] The present invention further provides use of the above method for analyzing the purity and/or content of each component in an artificially synthesized antisense oligonucleotide sequence for gene therapy.
[0032] Experiments show that the method of the present invention for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing can quickly, accurately, and comprehensively analyze the purity and content of each component in an oligonucleotide sequence.
DESCRIPTION OF THE DRAWINGS
[0033] FIG. 1 is a flowchart of a method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing.
FIG. 2 shows the results of electrophoresis detection of the product precipitated with sodium acetate.
FIG. 3 shows the results of electrophoresis detection of the PCR amplification product.
FIG. 4 shows the results of electrophoresis detection of the purified PCR amplification product.
FIG. 5 shows the results of high-throughput sequencing data analysis.
DETAILED DESCRIPTION OF THE INVENTION
[0034] Unless otherwise specified, the experimental methods used in the following example are conventional methods.
[0035] The materials, reagents, etc. used in the following example are available commercially, unless otherwise specified.
[0036] In the following example, T4 DNA polymerase is a product from NEB (Beijing) LTD, catalog number M0203L. DNA Polymerase I, Large (Klenow) Fragment is a product from NEB, catalog number M0210S; hereinafter it is referred to as DNA polymerase I. T4 Polynucleotide Kinase is a product from NEB, catalog number M0201. Klenow DNA polymerase I and Bst DNA Pol I large Fragment are products from NEB, and the catalog number is M0275. T4 DNA Ligase is a product from NEB, catalog number M0202L. Terminal transferase is a product from Thermo Fisher Scientific (China) Co., Ltd., catalog number EP0162. Deoxyadenosine triphosphate is a product from Thermo Fisher Scientific (China) Co., Ltd., catalog number 10216018. Phanta Max Super Fide DNA Polymerase is a product from Vazyme Biotech Co., Ltd., catalog number P505.
[0037] The 10×End Repair Buffer used in the following example contains the following solutes: 900 mM MgCl2, 30 mM DTT, 10 mM ATP, 1 µg/µL BSA and 4 mM dNTPs; the solvent is 500 mM Tris-HCl buffer (pH 8.3).
Example 1. A method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing
I. Adding poly tail to oligonucleotide
[0038] 1. A poly A tail was added to the 3' end of the oligonucleotide sequence, and a tailing reaction system was prepared according to the reagents and addition amounts in Table 1. The oligonucleotide sequence is as follows: 5'-CAGAGCAGCTTGTCTTTCTTC-3' (SEQ ID NO: 5). The oligonucleotide sequence was synthesized by Shanghai Generay Biotech Co., Ltd.
Table 1 Tailing reaction system
Reagent | Addition amount (µL)
Nuclease-free water | 13
Oligonucleotide (100 µM) | 1
dATP (25 µM) | 0.5
5×TdT Buffer | 4
Terminal transferase | 1.5
Total volume | 20
2. The reaction was performed at 37 °C for 25 min.
3. The reaction mixture was incubated at 70 °C for 10 min to inactivate the terminal transferase.
II. Reverse extension and amplification
[0039] 1. Each of the reagents shown in Table 2 was added to the reaction product obtained in step I above to prepare a reverse extension and amplification system.
Table 2 Reverse extension and amplification system
Reagent | Addition amount (µL)
Nuclease-free water | 19
2×Phanta Max Buffer | 25
dNTPs (10 mM) | 2
Extension primer extpT (100 µM) | 2
Oligonucleotide (100 µM) | 1
DNA Polymerase I | 1
Total volume | 50
The sequence of the extension primer extpT is as follows: 5'-GAGACACGAATAGACGGCACGATTTTTTTTTTTTTTTTTTTT-3' (SEQ ID NO: 1).
2. The reverse extension and amplification system prepared in step 1 was placed on a PCR instrument and the reaction procedure shown in Table 3 was performed.
Table 3 Reverse extension and amplification procedure
Segment | Conditions
Segment 1 | 95 °C for 30 s
Segment 2 (20 cycles) | 95 °C for 30 s; 60 °C for 15 s; 72 °C for 30 s
Segment 3 | 72 °C for 10 min
Segment 4 | Stored at 10 °C
III. Sodium acetate precipitation
[0040] 1. To the product obtained in step II were added 1/10 volume of sodium acetate (pH 5.2), 2.5 volumes of absolute ethanol and 1 µL of glycogen at a concentration of 20 mg/mL.
2. The mixture was kept at -80 °C for 30 minutes.
3. The mixture was then centrifuged at 12000 rpm for 30 minutes at 4 °C; the supernatant was discarded and the precipitate was collected.
4. 1 mL of 80% aqueous ethanol solution was added to the precipitate, which was then centrifuged at 12000 rpm for 5 minutes at 4 °C. The supernatant was discarded and the precipitate was collected.
5. Step 4 was repeated once.
6. The precipitate was dried at room temperature and dissolved in 30 µL of TE.
7. The precipitation product was detected as follows: a 12% PAGE gel was prepared, the precipitation product was loaded and electrophoresed at 200 V for 40 minutes, stained in a dark box for 10 minutes, and photographed in a gel imaging system. The results of electrophoresis detection are shown in FIG. 2.
IV. End repair
[0041] 1. An end repair reaction system was prepared according to the reagents and addition amounts in Table 4. After preparation, the reagents were mixed and briefly centrifuged.
Table 4 End repair reaction system
Reagent | Addition amount (µL)
Nuclease-free water | 7
10×End Repair Buffer | 3
T4 DNA Polymerase | 2
T4 Polynucleotide Kinase (T4 PNK) | 2
Klenow DNA polymerase I | 0.5
Bst DNA Pol I large Fragment | 0.5
The product of step III | 15
Total volume | 30
2. The end repair reaction system in step 1 was placed on a PCR instrument and the reaction procedure shown in Table 5 was performed.
Table 5 PCR reaction procedure
Segment | Conditions
Segment 1 | 20 °C for 30 min
Segment 2 | 65 °C for 30 min
Segment 3 | Stored at 10 °C
V. Adapter ligation
[0042] 1. An adapter ligation reaction system was prepared according to the reagents and addition amounts in Table 6. After preparation, the reagents were mixed and briefly centrifuged.
Table 6 Adapter ligation reaction system
Reagent | Addition amount (µL)
End repair product | 30
Nuclease-free water | 11
10×T4 DNA Ligase Buffer | 5
Y-shape adapter (40 µM) | 2
T4 DNA Ligase | 2
Total volume | 50
The above Y-shape adapter consists of UAF and AI5, and their sequences are as follows: UAF (the underlined base is thio-modified):
[0043] 1. Ampure XP magnetic beads (Agencourt AMPure XP Kit produced by Beckman Coulter, Inc., catalog number: A63880) were thoroughly mixed by shaking.
2. One volume (1×) of Ampure XP magnetic beads was added to the ligation product obtained in step V above, mixed 10 times with a pipette, and kept at room temperature for 1 min.
3. The mixture was placed on a magnetic stand for 5 min and the supernatant was discarded.
4. 200 µL of freshly prepared 80% aqueous ethanol solution was added to the magnetic beads and kept at room temperature for 30 s, and the supernatant was discarded.
5. Step 4 was repeated once.
6. 50 µL of 10 mM Tris-HCl (pH 8.0) was added for elution, and the supernatant was transferred to a new centrifuge tube.
7. One volume of magnetic beads was added, mixed 10 times with a pipette, and kept at room temperature for 1 min.
8. 200 µL of freshly prepared 80% aqueous ethanol solution was added to the magnetic beads and kept at room temperature for 30 s, and the supernatant was discarded.
9. Step 8 was repeated once.
10. The lid was opened and the tube was kept at room temperature for 10 min.
11. 50 µL of 10 mM Tris-HCl (pH 8.0) was added, mixed with a pipette, and kept at room temperature for 1 min.
12. The tube was placed on a magnetic stand for 5 min, and the supernatant was transferred to a new centrifuge tube to obtain the purified adapter-ligated product.
VII. PCR amplification of adapter-ligated product
[0044] 1. Each of the reagents in Table 7 was added to the purified adapter-ligated product (the addition amounts are also shown in Table 7) to prepare a PCR amplification system.
Table 7 PCR amplification system
Reagent | Addition amount (µL)
Nuclease-free water | 14
mpF (25 µM) | 2
mpRI5 (25 µM) | 2
dNTP mix (10 mM) | 1
2×Phanta Max Buffer | 25
Phanta Max Super Fide DNA Polymerase | 1
The sequence of the primer mpF is as follows (synthesized by Shanghai Generay Biotech Co., Ltd):
[0045] 1. The required bands in the above electrophoresis were cut out, and gel recovery was performed using the Agarose Gel DNA Recovery Kit (centrifugal column type: GK2042-50) from Shanghai Generay Biotech Co., Ltd.
2. 400 µL of Binding Solution was added to the gel and placed in a 50 °C water bath until the gel block was dissolved.
3. The mixture was shaken once every 2 minutes during this period.
4. The dissolved gel block was transferred to a silica gel column, kept at room temperature for 2 min, and centrifuged at 6000 rpm for 1 min, and the waste solution was discarded.
5. 500 µL of Washing Solution was added to the silica gel column and kept at room temperature for 3 min.
6. The column was centrifuged at 12000 rpm for 1 min and the waste solution was discarded.
7. Step 6 was repeated once.
8. The column was centrifuged at 12000 rpm for 1 min and transferred to a new 1.5 mL centrifuge tube.
9. 30 µL of nuclease-free water was added to the silica gel column and kept at room temperature for 2 min.
10. The column was centrifuged at 12000 rpm for 1 min and the eluate was collected to obtain a purified PCR amplification product.
The purified PCR amplification product was subjected to electrophoresis detection, and the results are shown in FIG. 4, wherein M: 20 bp DNA Ladder; 1: library gel recovery product. As can be seen from the figure, a high-throughput sequencing library was successfully obtained.
IX. High-throughput sequencing and data analysis
1. High-throughput sequencing
[0046] The library constructed in step VIII was sequenced on the HiSeq 3000 platform in single-end 150 bp sequencing mode.
2. Analysis of high-throughput sequencing results
[0047] The trimmomatic-0.33 software (https://www.usadellab.org/cms/index.php?page=trimmomatic) was used to remove low-quality bases at the 3' end of the reads. The self-written perl script ExtractValid.pl was used to extract the forward sequencing reads containing the adapter. The cutadapt1.2.1 software (https://github.com/marcelm/cutadapt/releases/tag/v1.2.1) was used to remove the Poly A adapter from the reads. The self-written perl script FilterTN.pl was used to remove the reads containing N and Poly T adapters. The self-written perl script trim_polytail.pl was used to remove Poly tails in which wrong bases had been added at the end because of impure dNTP. The self-written perl script FastQ_ReadFilterByLength.pl was used to filter out reads that were too long or too short. The fastx_collapser module in the FASTX Toolkit 0.0.13 software (https://hannonlab.cshl.edu/fastx_toolkit/) was used to merge repeated sequences and to remove reads with a count of only one. The purity of each component of the oligonucleotide was calculated according to the following formula: the ratio of each component (%) = the number of reads of the component / the sum of the number of reads of all components × 100%.
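The last three operations of this workflow (length filtering, merging identical reads while dropping reads seen only once, and the ratio formula) can be sketched in plain Python. This is not the self-written perl scripts or fastx_collapser itself, only an illustration of the logic; the function names and the length bounds are hypothetical, and the read counts used for the ratio check are those reported in Table 9 of this example.

```python
from collections import Counter


def collapse_reads(reads, min_len, max_len, min_count=2):
    """Filter reads by length, merge identical reads, drop reads seen fewer than min_count times."""
    kept = [r for r in reads if min_len <= len(r) <= max_len]
    counts = Counter(kept)
    return {seq: n for seq, n in counts.items() if n >= min_count}


def component_ratios(counts):
    """Ratio (%) of each component = reads of the component / sum of reads of all components x 100."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to analyze")
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy reads; the bounds 3 and 30 are placeholders, not values from the patent.
toy = ["ACGT"] * 3 + ["ACG"] * 2 + ["ACGA"] + ["A" * 50]
collapsed = collapse_reads(toy, min_len=3, max_len=30)

# Read counts per category as reported in Table 9.
table9 = {
    "identical": 7830352,
    "5' truncated": 1331990,
    "3' truncated": 144439,
    "5' and 3' truncated": 18697,
    "other": 339111,
}
ratios = {name: round(pct, 2) for name, pct in component_ratios(table9).items()}
print(collapsed)
print(ratios)
```

Rounded to two decimals, the ratios computed from the Table 9 counts match the percentages reported there.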
[0048] The results are shown in Table 9 and FIG. 5. Of the 9664589 oligonucleotide sequences obtained, 7830352 were completely identical to the oligonucleotide sequence (CAGAGCAGCTTGTCTTTCTTC), a ratio of 81.02%. Compared with the oligonucleotide sequence, 1331990 sequences lacked a partial sequence at the 5' end (13.78%); 144439 sequences lacked a partial sequence at the 3' end (1.49%); 18697 sequences lacked partial sequences at both the 5' end and the 3' end (0.19%); and 339111 sequences fell into other cases, such as insertion, deletion and incorporation of wrong bases (3.51%). The above results show that the method of the present invention can accurately and comprehensively analyze the content of each component in the oligonucleotide sequence and its ratio.
Table 9 Data analysis results
Category (compared with the oligonucleotide sequence) | Number (of sequences) | Ratio (%)
Completely identical | 7830352 | 81.02%
Lacking a partial sequence at the 5' end | 1331990 | 13.78%
Lacking a partial sequence at the 3' end | 144439 | 1.49%
Lacking partial sequences both at the 5' end and 3' end | 18697 | 0.19%
Other cases | 339111 | 3.51%
Total | 9664589 |
Note: The oligonucleotide sequence is CAGAGCAGCTTGTCTTTCTTC.
3. Data analysis
[0049] The ncbi-blast-2.2.28+ software (https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.28/) was first used to align the reads with the oligonucleotide reference sequence; a self-written script was then used to parse the alignment results, standardize the parsed results, classify the standardized results, and finally count the classified results.
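The classify-and-count step can be illustrated with a much simpler stand-in for the BLAST-based workflow: exact string comparison against the reference. The Python sketch below is an assumption-laden simplification, not the self-written script; it only separates the categories used in Table 9, and it lumps substitutions, internal deletions and insertions together as "other". The function name is hypothetical and the example reads are components listed in Table 10.

```python
from collections import Counter

REFERENCE = "CAGAGCAGCTTGTCTTTCTTC"  # SEQ ID NO: 5


def classify_read(read: str, reference: str = REFERENCE) -> str:
    """Assign a read to one of the Table 9 categories by exact string matching."""
    if not read:
        return "other"
    if read == reference:
        return "identical"
    if reference.endswith(read):
        return "5' truncated"         # bases missing at the 5' end only
    if reference.startswith(read):
        return "3' truncated"         # bases missing at the 3' end only
    if read in reference:
        return "5' and 3' truncated"  # bases missing at both ends
    return "other"                    # substitutions, internal deletions, insertions


reads = [
    "CAGAGCAGCTTGTCTTTCTTC",  # full-length product
    "GCTTGTCTTTCTTC",         # Table 10, component 2
    "CAGAGCAGCTTGTCTTTCTT",   # Table 10, component 10
    "CGGAGCAGCTTGTCTTTCTTC",  # Table 10, component 15 (one changed base)
]
tally = Counter(classify_read(r) for r in reads)
print(tally)
```

An alignment-based classifier, as used in the example, is needed to distinguish the cases that this sketch groups under "other".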
[0050] Table 10 shows the components with a content of greater than 0.1% and their contents. The analysis results of N-1 reads and N+1 reads are shown in Tables 11 and 12. The tables list the sequence and content of each component whose content is greater than 0.1%, and the number and ratio of each category of N-1 reads and N+1 reads.
Table 10 Components with a content of greater than 0.1% and their contents
Component No. | Component | Ratio (%)
1-7830352 | CAGAGCAGCTTGTCTTTCTTC | 81.02%
2-209704 | GCTTGTCTTTCTTC | 2.17%
3-204757 | AGAGCAGCTTGTCTTTCTTC | 2.12%
4-182418 | CAGCTTGTCTTTCTTC | 1.89%
5-165497 | GAGCAGCTTGTCTTTCTTC | 1.71%
6-157170 | GCAGCTTGTCTTTCTTC | 1.63%
7-154399 | AGCAGCTTGTCTTTCTTC | 1.60%
8-133435 | CTTGTCTTTCTTC | 1.38%
9-122557 | AGCTTGTCTTTCTTC | 1.27%
10-85682 | CAGAGCAGCTTGTCTTTCTT | 0.89%
11-34227 | CAGAGCAGCTTGTCTTTCTC | 0.35%
12-18571 | CAGAGCAGCTTGTCTTTCT | 0.19%
13-18146 | CAGAGCAGCTTGTCTTCTTC | 0.19%
14-15125 | CAGAGAGCTTGTCTTTCTTC | 0.16%
15-13947 | CGGAGCAGCTTGTCTTTCTTC | 0.14%
16-13400 | CTGAGCAGCTTGTCTTTCTTC | 0.14%
17-11075 | CAGAGCAGCTTGTCTTTC | 0.11%
18-10938 | CAAGCAGCTTGTCTTTCTTC | 0.11%
19-9391 | CGAGCAGCTTGTCTTTCTTC | 0.10%
Others | | 2.83%
Table 11 Analysis results of N-1 reads
No. | Category (compared with the oligonucleotide sequence) | Number of reads | Ratio (%)
1 | Reads lacking some bases at the 5' end or 3' end | 290439 | 68.22%
2 | Reads with incorrect bases | 2447 | 0.57%
3 | Reads with missing bases in the middle | 76214 | 17.9%
4 | Reads with partial insertion or deletion at the 5' end or 3' end | 57425 | 13.49%
5 | Reads with inserted bases | 137 | 0.03%
6 | Total number of N-1 reads | 425718 |
Note: The oligonucleotide sequence is CAGAGCAGCTTGTCTTTCTTC.
Table 12 Analysis results of N+1 reads
No. | Category (compared with the oligonucleotide sequence) | Number of reads | Ratio (%)
1 | Reads with incorrect bases | 73 | 0.56
2 | Reads with inserted bases | 8571 | 65.32
3 | Reads whose bases at the 5' end and 3' end cannot match | 4568 | 34.81
4 | Reads with missing bases in the middle | 0 | 0
5 | Total number of N+1 reads | 13122 |
Note: The oligonucleotide sequence is CAGAGCAGCTTGTCTTTCTTC.
Industrial application
[0051] The present invention provides a method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing. The method of the present invention comprises the following steps: constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence; subjecting the high-throughput sequencing library to high-throughput sequencing, and analyzing the nucleotide sequence components according to the sequencing results; the sequence of the extension primer used in the construction of the high-throughput sequencing library consisting of the DNA molecule set forth in positions 1-22 of SEQ ID NO: 2 and N bases (A, T, C or G) in sequence; and N being an integer greater than or equal to 6. It is proved by experiments that the method for analyzing impurities of an oligonucleotide sequence based on high-throughput sequencing of the present invention can quickly, accurately, and comprehensively analyze the purity and content of each component in the oligonucleotide sequence.
Claims (13):
[0001] A method for constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence, comprising the following steps:
1) adding a poly tail to the 3' end of an oligonucleotide sequence to be detected to obtain a poly-tailed product;
2) performing reverse extension and amplification on the poly-tailed product to obtain a reverse extension and amplification product; the primers used for the reverse extension and amplification consisting of the oligonucleotide to be detected and an extension primer; the extension primer sequence consisting of the DNA molecule set forth in positions 1-22 of SEQ ID NO: 2 and N bases (A, T, C or G) in sequence; and N being an integer greater than or equal to 6;
3) precipitating the reverse extension and amplification product to obtain a precipitated product;
4) performing end repair, A-tailing reaction, adapter ligation and PCR amplification on the precipitated product sequentially to obtain a high-throughput sequencing library.
[0002] The method according to claim 1, wherein the extension primer sequence is the DNA molecule set forth in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4.
[0003] The method according to claim 2, wherein the extension primer sequence is the DNA molecule set forth in SEQ ID NO: 1.
[0004] The method according to claim 1, wherein the oligonucleotide is single-stranded DNA or double-stranded DNA.
[0005] The method according to claim 4, wherein the oligonucleotide is single-stranded DNA.
[0006] The method according to claim 1, wherein the length of the oligonucleotide is 8-120 bp.
[0007] A product, which is any one of the following a1) -a3):
a1) the extension primer in claim 1;
a2) a PCR reagent containing the extension primer of a1);
a3) a kit containing the extension primer of a1) or the PCR reagent of a2).
[0008] The product according to claim 7, wherein the final concentration of the extension primer in the PCR reagent is 0.1-100 µM.
[0009] Use of the product according to claim 7 or 8 for constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence.
[0010] Use of the product according to claim 7 or 8 for analyzing impurities of an oligonucleotide sequence.
[0011] A method for analyzing impurities of an oligonucleotide sequence, comprising the following steps:
(1) constructing a high-throughput sequencing library for analysis of impurities of an oligonucleotide sequence according to the method of any one of claims 1-6;
(2) subjecting the high-throughput sequencing library to high-throughput sequencing, and analyzing the nucleotide sequence components according to the sequencing results.
[0012] Use of the method according to claim 11 for analyzing components of an artificially synthesized antisense oligodeoxynucleotide sequence for gene therapy.
[0013] Use of the method according to claim 11 for analyzing the purity and/or content of each component in an artificially synthesized antisense oligodeoxynucleotide sequence for gene therapy.
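The analysis step of claims 11 and 13 (determining the purity and/or content of each component from the sequencing results) can be illustrated with a minimal sketch. It is not the procedure of the invention, which the claims do not spell out. It assumes that the reads have already been trimmed of adapter and extension-primer sequence, and that the share of reads carrying a given sequence approximates that component's content. The function name `component_content`, its parameters, and the example reads are hypothetical. The default length bounds follow the 8-120 range of claim 6.

```python
from collections import Counter


def component_content(reads, expected, min_len=8, max_len=120):
    """Return (purity, content) for a list of trimmed read sequences.

    purity  : fraction of kept reads identical to the expected sequence
    content : dict mapping each distinct sequence to its read fraction
    All names here are illustrative assumptions, not part of the patent.
    """
    # Keep only reads inside the assumed oligonucleotide length range.
    kept = [r.upper() for r in reads if min_len <= len(r) <= max_len]
    if not kept:
        raise ValueError("no reads within the allowed length range")

    counts = Counter(kept)
    total = len(kept)

    # Read fraction per distinct sequence, most abundant first.
    content = {seq: n / total for seq, n in counts.most_common()}
    purity = content.get(expected.upper(), 0.0)
    return purity, content


# Hypothetical example: 8 full-length reads, 1 truncated, 1 substituted.
reads = ["ACGTACGTAC"] * 8 + ["ACGTACGTA"] + ["ACGTACGTAG"]
purity, content = component_content(reads, "ACGTACGTAC")
print(f"purity = {purity:.2f}")
for seq, frac in content.items():
    print(seq, f"{frac:.2f}")
```

In this sketch every sequence other than the expected one is reported as a separate impurity component. A practical pipeline would likely also need to account for sequencing errors before treating a low-abundance sequence as a true impurity.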
Patent family:
Publication number | Publication date
US20210095270A1|2021-04-01|
EP3680346A4|2021-05-19|
CN109517889A|2019-03-26|
JP2020534868A|2020-12-03|
WO2019052322A1|2019-03-21|
Legal status:
2019-03-22| STAA| Information on the status of an ep patent application or granted ep patent|Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
2020-06-12| STAA| Information on the status of an ep patent application or granted ep patent|Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
2020-06-12| PUAI| Public reference made under article 153(3) epc to a published international application that has entered the european phase|Free format text: ORIGINAL CODE: 0009012 |
2020-07-15| 17P| Request for examination filed|Effective date: 20200318 |
2020-07-15| AK| Designated contracting states|Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
2020-07-15| AX| Request for extension of the european patent|Extension state: BA ME |
2020-12-23| DAV| Request for validation of the european patent (deleted)|
2020-12-23| DAX| Request for extension of the european patent (deleted)|
2021-05-19| A4| Supplementary search report drawn up and despatched|Effective date: 20210421 |
2021-05-19| RIC1| Information provided on ipc code assigned before grant|Ipc: C12Q 1/68 20180101AFI20210415BHEP Ipc: C12Q 1/6869 20180101ALI20210415BHEP Ipc: C12Q 1/6876 20180101ALI20210415BHEP Ipc: C12Q 1/6806 20180101ALI20210415BHEP Ipc: C40B 50/06 20060101ALI20210415BHEP Ipc: C12Q 1/6848 20180101ALI20210415BHEP |